Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
The present paper develops a novel aggregated gradient approach for distributed machine learning that adaptively compresses the gradient communication. The key idea is to first quantize the computed gradients, and then skip less informative quantized gradient communications by reusing outdated gradients. Quantizing and skipping result in 'lazy' worker-server communications, which justifies the term Lazily Aggregated Quantized gradient that is henceforth abbreviated as LAQ. Our LAQ can provably attain the same linear convergence rate as gradient descent in the strongly convex case, while effecting major savings in communication overhead, both in transmitted bits and in communication rounds. Empirically, experiments with real data corroborate a significant communication reduction compared to existing gradient- and stochastic gradient-based algorithms.
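The quantize-then-skip mechanism described in the abstract can be illustrated with a short worker-side sketch. This is a minimal illustration, not the paper's exact algorithm: the uniform quantizer, the squared-norm "innovation" test, and the names `worker_step` and `skip_threshold` are assumptions for exposition; LAQ's actual skipping rule involves weighted differences of past iterates.

```python
import numpy as np

def quantize(grad, ref, bits=4):
    """Uniformly quantize the gap between the current gradient and the last
    transmitted reference using 2**bits levels (one common scheme; the
    paper's exact quantizer may differ in its details)."""
    levels = 2 ** bits
    radius = np.max(np.abs(grad - ref)) + 1e-12   # dynamic range of the innovation
    step = 2 * radius / (levels - 1)
    q = np.round((grad - ref + radius) / step)    # map to integer levels
    return ref - radius + q * step                # dequantized gradient estimate

def worker_step(grad, state, skip_threshold):
    """One LAQ-style communication decision for a single worker.
    `state['last_sent']` holds the most recent quantized gradient the server
    knows. Returns (message, new_state); message is None when the upload is
    skipped and the server simply reuses the outdated gradient."""
    q_grad = quantize(grad, state["last_sent"])
    innovation = np.linalg.norm(q_grad - state["last_sent"]) ** 2
    if innovation < skip_threshold:               # gradient barely changed: stay lazy
        return None, state
    return q_grad, {"last_sent": q_grad}
```

When the quantized gradient hardly moves, the worker stays silent, which is where both kinds of savings come from: fewer bits per message (quantization) and fewer messages (skipping).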
Reviews: Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
The paper extends the lazily aggregated gradient (LAG) approach by applying quantization to further reduce communication. In the original LAG approach, workers only communicate their gradient to the central coordinator if it differs significantly from the previously communicated one. In this paper, the gradients are compressed using quantization, and workers skip communication if their quantized gradient does not differ substantially from previous ones. For strongly convex objectives, the paper proves linear convergence. The paper is very well written, and the approach is clearly motivated, easy to understand, and discussed in the context of related work.
A-LAQ: Adaptive Lazily Aggregated Quantized Gradient
Mahmoudi, Afsaneh, Júnior, José Mairton Barros Da Silva, Ghadikolaei, Hossein S., Fischione, Carlo
Federated Learning (FL) plays a prominent role in solving machine learning problems with data distributed across clients. In FL, to reduce the communication overhead between clients and the server, each client communicates its local FL parameters instead of its local data. However, when a wireless network connects the clients and the server, the clients' communication resource limitations may prevent the FL training iterations from completing. Therefore, communication-efficient variants of FL have been widely investigated. Lazily Aggregated Quantized Gradient (LAQ) is one of the promising communication-efficient approaches to lower resource usage in FL. However, LAQ assigns a fixed number of bits for all iterations, which may be communication-inefficient when the number of iterations is medium to high or convergence is approaching. This paper proposes Adaptive Lazily Aggregated Quantized Gradient (A-LAQ), a method that significantly extends LAQ by assigning an adaptive number of communication bits during the FL iterations. We train FL under an energy-constrained condition and provide a convergence analysis for A-LAQ. The experimental results highlight that A-LAQ outperforms LAQ with up to a 50% reduction in spent communication energy and an 11% increase in test accuracy.
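To make the fixed-versus-adaptive distinction concrete, the sketch below shows one possible bit schedule. This is purely illustrative and not A-LAQ's actual rule: the paper ties the per-iteration bit budget to an energy constraint, whereas here bits simply decay linearly as training proceeds; `adaptive_bits` and its parameters are hypothetical names.

```python
def adaptive_bits(iteration, total_iters, b_max=8, b_min=2):
    """Toy schedule: start with b_max bits per gradient entry and decay
    linearly toward b_min as convergence is approached, so late iterations
    (whose gradients change little) cost fewer transmitted bits."""
    frac = iteration / max(total_iters - 1, 1)
    return max(b_min, round(b_max - frac * (b_max - b_min)))
```

Under this toy schedule, the cumulative bit count over a run is strictly below the fixed-rate baseline of `b_max` bits at every iteration, which is the kind of saving the abstract's 50% energy-reduction figure refers to.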
Communication-Efficient Distributed Learning via Lazily Aggregated Quantized Gradients
Sun, Jun, Chen, Tianyi, Giannakis, Georgios, Yang, Zaiyue